Answer all questions listed below. Data are provided in the Excel file “Assignment 1 data.xls”.
Prepare your answers in RMarkdown, showing your code and explaining your results. Submit both the knitted HTML output and a PDF version of that HTML output.
Please include your name in the file name of your hand-ins so that we can easily identify your work.
This is an individual assignment. Please do not copy or allow someone else to copy your work.
The deadline for this assignment is 31 Aug 2022, 2359h. Please submit both files on Blackboard > Assignments > Assignments Folder > Assignment 1.
By the end of this assignment, you should be able to:
Photo by Caleb Martin on Unsplash
In a study of squirrels, the weights of 50 female and 50 male squirrels were measured. The observations are in the columns FEMALE and MALE (Squirrels dataset).
1A. Plot the data using histogram(s) and boxplot(s).
1B. Perform a two-sample t-test and a Mann-Whitney test.
1C. What features of the plots suggest that the use of a parametric test may be unwise?
1D. From the results of the (parametric) t-test, report the t statistic, degrees of freedom (df), and p-value, and interpret them.
1E. From the results of the (non-parametric) Mann-Whitney test, report the W statistic and p-value, and interpret them.
1F. What type of statistical error (Type I or Type II) do you risk making if you use a t-test on these data?
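For orientation, the R calls involved in Question 1 can be sketched as follows. This is a minimal sketch on simulated, right-skewed weights, not the Squirrels data: the vectors `female` and `male`, the seed, and the log-normal parameters are all assumptions made for illustration. Note that R runs the Mann-Whitney test through `wilcox.test()` (the Wilcoxon rank-sum test) and that `t.test()` uses the Welch version by default.

```r
# Simulated, skewed weights for illustration only; NOT the assignment data.
set.seed(1)
female <- rlnorm(50, meanlog = 5.0, sdlog = 0.6)
male   <- rlnorm(50, meanlog = 5.2, sdlog = 0.6)

# Histograms and a boxplot side by side
op <- par(mfrow = c(1, 3))
hist(female, main = "Female", xlab = "Weight")
hist(male, main = "Male", xlab = "Weight")
boxplot(list(Female = female, Male = male), ylab = "Weight")
par(op)

# Two-sample t-test (Welch by default, i.e. var.equal = FALSE)
tt <- t.test(female, male)

# Mann-Whitney test, which R calls the Wilcoxon rank-sum test
mw <- wilcox.test(female, male)

# Pieces you would report: t, df, p-value; then W, p-value
tt$statistic
tt$parameter
tt$p.value
mw$statistic
mw$p.value
```

With the real data you would replace the simulated vectors with the FEMALE and MALE columns after reading the Excel file.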
Photo by Loren Joseph on Unsplash
An experiment was performed to compare four melon varieties. It was designed so that each variety was grown in six plots — but two plots growing variety 3 were accidentally destroyed. The data (Melons dataset) are in the columns YIELDM and VARIETY.
2A. State the null and alternative hypotheses.
2B. Plot the data so that you can see the difference in yield between the varieties.
2C. Produce descriptive statistics that show you the means of the groups, and the confidence intervals.
2D. Produce an analysis of variance of YIELDM with respect to VARIETY.
2E. Show and describe the model’s diagnostic plots.
2F. Report the F statistic, degrees of freedom (df), and p-value, and interpret them.
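The workflow for Question 2 might look something like the sketch below. It uses simulated yields with the same unbalanced layout described above (six plots per variety, four for variety 3); the data frame `melons`, the group means, and the standard deviation are assumptions for illustration, not the Melons data. One thing worth checking with the real data: if VARIETY is stored as a number, it needs to be converted with `factor()`, otherwise `aov()` treats it as a continuous predictor.

```r
# Simulated yields for illustration only; NOT the assignment data.
set.seed(2)
variety <- factor(rep(1:4, times = c(6, 6, 4, 6)))  # variety 3 has 4 plots
yield   <- rnorm(22, mean = c(20, 25, 22, 30)[as.integer(variety)], sd = 3)
melons  <- data.frame(YIELDM = yield, VARIETY = variety)

# Plot yield by variety
boxplot(YIELDM ~ VARIETY, data = melons, xlab = "Variety", ylab = "Yield")

# Group means with t-based 95% confidence intervals
ci <- t(sapply(split(melons$YIELDM, melons$VARIETY), function(y) {
  half <- qt(0.975, df = length(y) - 1) * sd(y) / sqrt(length(y))
  c(n = length(y), mean = mean(y), lower = mean(y) - half, upper = mean(y) + half)
}))
ci

# One-way analysis of variance
fit <- aov(YIELDM ~ VARIETY, data = melons)
summary(fit)

# Standard diagnostic plots (residuals vs fitted, Q-Q, scale-location, leverage)
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)
```

The confidence intervals here are computed separately for each group; intervals based on the pooled residual variance from the model are an equally reasonable choice.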
Photo by Anton Maksimov juvnsky on Unsplash
A plant species is dioecious if each individual produces all male flowers or all female flowers. The Dioecious trees dataset contains data from 50 trees of one particular dioecious species, from a ten-hectare area of mixed woodland.
For each individual, three variables were recorded: SEX (coded as 1 for male and 2 for female), the diameter at breast height in millimetres (DBH), and the number of flowers on the tree at the time of measurement (FLOWERS).
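Because SEX is stored as a numeric code, it may help to convert it to a labelled factor before plotting or testing. A minimal sketch, using a small made-up data frame `trees` whose values are hypothetical and not taken from the dataset:

```r
# Made-up rows for illustration only; NOT the assignment data.
trees <- data.frame(
  SEX     = c(1, 2, 2, 1, 2),
  DBH     = c(210, 185, 240, 199, 172),
  FLOWERS = c(120, 45, 60, 98, 51)
)

# Recode 1/2 as a factor with readable labels (1 = male, 2 = female)
trees$SEX <- factor(trees$SEX, levels = c(1, 2), labels = c("male", "female"))

# Check the recoding
table(trees$SEX)
```

Plots and model output then show "male" and "female" rather than 1 and 2.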
3A. Show graphically how the number of flowers differs between the sexes.
3B. Test the hypothesis that male and female trees produce different numbers of flowers.
3C. Report the appropriate test statistic, degrees of freedom (df), and p-value, and interpret them.